Python: Implement message filtering in agent workflow to stop emitting already seen responses #4268

Open
leelakarthik wants to merge 7 commits into microsoft:main from leelakarthik:patch-6

Conversation

@leelakarthik leelakarthik commented Feb 25, 2026

Summary

Fixes #4261

When a GroupChat orchestrator terminates, it yields self._full_conversation
(the entire conversation history) via ctx.yield_output(). Without filtering,
WorkflowAgent converts all messages — including user inputs and earlier assistant
responses — into output, causing them to compound across successive turns.

Root Cause

The termination path in BaseGroupChatOrchestrator calls
ctx.yield_output(self._full_conversation) which is a list[Message]. This hits
the list[Message] branch in both _convert_workflow_event_to_agent_response_updates
(streaming) and _convert_workflow_events_to_agent_response (non-streaming), where
all messages were forwarded without filtering.

Fix

Added _filter_messages() that returns only the last meaningful assistant message
from list[Message] output. This is intentionally aggressive because:

  1. User messages — re-emitting these as assistant output is the reported bug
  2. Earlier assistant messages — already surfaced in real-time via streaming;
    including them again causes duplication
  3. Default behavior alignment: GroupChatBuilder defaults to
    intermediate_outputs=False. Users who need intermediate responses can opt in
    via intermediate_outputs=True
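
For reviewers who want the intended semantics at a glance, here is a minimal sketch of the rule described above. It is not the code in this PR: `Message` is a stand-in dataclass with only `role` and `text`, and `filter_messages` is a hypothetical free function standing in for `_filter_messages`. The "non-blank text" check is an assumption about what "meaningful" means.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Message:
    # Stand-in for the framework's Message type (assumption: only these two fields matter here).
    role: str
    text: Optional[str] = None


def filter_messages(messages: list[Message]) -> list[Message]:
    """Keep only the last non-user message that has non-blank text."""
    for msg in reversed(messages):
        if msg.role != "user" and msg.text and msg.text.strip():
            return [msg]
    return []  # no meaningful assistant output to surface


full_conversation = [
    Message("user", "hi"),
    Message("assistant", "Hello! How can I help?"),
    Message("user", "what is 2 + 2?"),
    Message("assistant", "4"),
]
surfaced = filter_messages(full_conversation)
print([m.text for m in surfaced])  # ['4']
```

Walking the list in reverse means the scan stops at the first qualifying message, so the user turns and the earlier assistant reply are never copied into the output.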

Relationship to #4275

PR #4275 filters the AgentResponse branch. This PR filters the list[Message]
branch — the actual code path hit by GroupChat's termination yield. Both fixes are
complementary.

Tests

  • Updated test_workflow_as_agent_yield_output_with_list_of_chat_messages to
    reflect new filtering semantics
  • Added TestWorkflowAgentUserInputFilteringRegression with multi-turn compounding
    regression tests for streaming and non-streaming paths

Contribution Checklist

  • The code builds clean without any errors or warnings
  • The PR follows the Contribution Guidelines
  • All unit tests pass, and I have added new tests where possible
  • Is this a breaking change? If yes, add "[BREAKING]" prefix to the title of the PR.

@github-actions github-actions bot changed the title from "Implement message filtering in agent workflow to stop emitting already seen responses" to "Python: Implement message filtering in agent workflow to stop emitting already seen responses" on Feb 25, 2026
@leelakarthik
Author

@leelakarthik please read the following Contributor License Agreement (CLA). If you agree with the CLA, please reply with the following information.

@microsoft-github-policy-service agree [company="{your company}"]

Options:

  • (default - no company specified) I have sole ownership of intellectual property rights to my Submissions and I am not making Submissions in the course of work for my employer.
@microsoft-github-policy-service agree
  • (when company given) I am making Submissions in the course of work for my employer (or my employer has intellectual property rights in my Submissions by contract or applicable law). I have permission from my employer to make Submissions and enter into this Agreement on behalf of my employer. By signing below, the defined term “You” includes me and my employer.
@microsoft-github-policy-service agree company="Microsoft"

Contributor License Agreement

Contribution License Agreement

This Contribution License Agreement (“Agreement”) is agreed to by the party signing below (“You”), and conveys certain license rights to Microsoft Corporation and its affiliates (“Microsoft”) for Your contributions to Microsoft open source projects. This Agreement is effective as of the latest signature date below.

  1. Definitions.
    “Code” means the computer software code, whether in human-readable or machine-executable form,
    that is delivered by You to Microsoft under this Agreement.
    “Project” means any of the projects owned or managed by Microsoft and offered under a license
    approved by the Open Source Initiative (www.opensource.org).
    “Submit” is the act of uploading, submitting, transmitting, or distributing code or other content to any
    Project, including but not limited to communication on electronic mailing lists, source code control
    systems, and issue tracking systems that are managed by, or on behalf of, the Project for the purpose of
    discussing and improving that Project, but excluding communication that is conspicuously marked or
    otherwise designated in writing by You as “Not a Submission.”
    “Submission” means the Code and any other copyrightable material Submitted by You, including any
    associated comments and documentation.
  2. Your Submission. You must agree to the terms of this Agreement before making a Submission to any
    Project. This Agreement covers any and all Submissions that You, now or in the future (except as
    described in Section 4 below), Submit to any Project.
  3. Originality of Work. You represent that each of Your Submissions is entirely Your original work.
    Should You wish to Submit materials that are not Your original work, You may Submit them separately
    to the Project if You (a) retain all copyright and license information that was in the materials as You
    received them, (b) in the description accompanying Your Submission, include the phrase “Submission
    containing materials of a third party:” followed by the names of the third party and any licenses or other
    restrictions of which You are aware, and (c) follow any other instructions in the Project’s written
    guidelines concerning Submissions.
  4. Your Employer. References to “employer” in this Agreement include Your employer or anyone else
    for whom You are acting in making Your Submission, e.g. as a contractor, vendor, or agent. If Your
    Submission is made in the course of Your work for an employer or Your employer has intellectual
    property rights in Your Submission by contract or applicable law, You must secure permission from Your
    employer to make the Submission before signing this Agreement. In that case, the term “You” in this
    Agreement will refer to You and the employer collectively. If You change employers in the future and
    desire to Submit additional Submissions for the new employer, then You agree to sign a new Agreement
    and secure permission from the new employer before Submitting those Submissions.
  5. Licenses.
  • Copyright License. You grant Microsoft, and those who receive the Submission directly or
    indirectly from Microsoft, a perpetual, worldwide, non-exclusive, royalty-free, irrevocable license in the
    Submission to reproduce, prepare derivative works of, publicly display, publicly perform, and distribute
    the Submission and such derivative works, and to sublicense any or all of the foregoing rights to third
    parties.
  • Patent License. You grant Microsoft, and those who receive the Submission directly or
    indirectly from Microsoft, a perpetual, worldwide, non-exclusive, royalty-free, irrevocable license under
    Your patent claims that are necessarily infringed by the Submission or the combination of the
    Submission with the Project to which it was Submitted to make, have made, use, offer to sell, sell and
    import or otherwise dispose of the Submission alone or with the Project.
  • Other Rights Reserved. Each party reserves all rights not expressly granted in this Agreement.
    No additional licenses or rights whatsoever (including, without limitation, any implied licenses) are
    granted by implication, exhaustion, estoppel or otherwise.
  6. Representations and Warranties. You represent that You are legally entitled to grant the above
    licenses. You represent that each of Your Submissions is entirely Your original work (except as You may
    have disclosed under Section 3). You represent that You have secured permission from Your employer to
    make the Submission in cases where Your Submission is made in the course of Your work for Your
    employer or Your employer has intellectual property rights in Your Submission by contract or applicable
    law. If You are signing this Agreement on behalf of Your employer, You represent and warrant that You
    have the necessary authority to bind the listed employer to the obligations contained in this Agreement.
    You are not expected to provide support for Your Submission, unless You choose to do so. UNLESS
    REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING, AND EXCEPT FOR THE WARRANTIES
    EXPRESSLY STATED IN SECTIONS 3, 4, AND 6, THE SUBMISSION PROVIDED UNDER THIS AGREEMENT IS
    PROVIDED WITHOUT WARRANTY OF ANY KIND, INCLUDING, BUT NOT LIMITED TO, ANY WARRANTY OF
    NONINFRINGEMENT, MERCHANTABILITY, OR FITNESS FOR A PARTICULAR PURPOSE.
  7. Notice to Microsoft. You agree to notify Microsoft in writing of any facts or circumstances of which
    You later become aware that would make Your representations in this Agreement inaccurate in any
    respect.
  8. Information about Submissions. You agree that contributions to Projects and information about
    contributions may be maintained indefinitely and disclosed publicly, including Your name and other
    information that You submit with Your Submission.
  9. Governing Law/Jurisdiction. This Agreement is governed by the laws of the State of Washington, and
    the parties consent to exclusive jurisdiction and venue in the federal courts sitting in King County,
    Washington, unless no federal subject matter jurisdiction exists, in which case the parties consent to
    exclusive jurisdiction and venue in the Superior Court of King County, Washington. The parties waive all
    defenses of lack of personal jurisdiction and forum non-conveniens.
  10. Entire Agreement/Assignment. This Agreement is the entire agreement between the parties, and
    supersedes any and all prior agreements, understandings or communications, written or oral, between
    the parties relating to the subject matter hereof. This Agreement may be assigned by Microsoft.

@microsoft-github-policy-service agree

@leelakarthik
Author

Updated: Added tests and clarified filtering rationale

What this fixes

Issue #4261: When a GroupChat orchestrator terminates, it yields
self._full_conversation (the entire conversation history) via
ctx.yield_output(). Without filtering, WorkflowAgent converts all
messages — including user inputs and earlier assistant responses — into
output, causing them to compound across successive turns.

Root cause analysis

The termination path in BaseGroupChatOrchestrator:

# _check_terminate_and_yield / _check_agent_terminate_and_yield
await ctx.yield_output(self._full_conversation)  # list[Message]

This hits the list[Message] branch in both
_convert_workflow_event_to_agent_response_updates (streaming) and
_convert_workflow_events_to_agent_response (non-streaming), where all
messages were forwarded without filtering.
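
To make the compounding concrete, here is a toy simulation of two turns. It is a sketch under assumptions, not the framework's code: `orchestrate` and `run_turn` are hypothetical helpers, messages are plain `(role, text)` tuples, and the caller is assumed to append each turn's output to the thread it sends on the next turn.

```python
Msg = tuple[str, str]  # (role, text); stand-in for the framework's Message


def orchestrate(thread: list[Msg]) -> list[Msg]:
    """Toy orchestrator: add one assistant reply, then yield the whole conversation."""
    last_user_text = thread[-1][1]
    return thread + [("assistant", f"reply to {last_user_text!r}")]


def run_turn(thread: list[Msg], user_text: str, *, filter_output: bool) -> tuple[list[Msg], list[Msg]]:
    """Run one turn and return (updated thread, messages surfaced as agent output)."""
    thread = thread + [("user", user_text)]
    yielded = orchestrate(thread)
    # With filtering, only the final assistant message is surfaced.
    output = yielded[-1:] if filter_output else yielded
    return thread + output, output


def output_sizes(filter_output: bool) -> list[int]:
    """Number of messages surfaced on each of two consecutive turns."""
    thread: list[Msg] = []
    sizes = []
    for text in ("hi", "how are you?"):
        thread, output = run_turn(thread, text, filter_output=filter_output)
        sizes.append(len(output))
    return sizes


print(output_sizes(filter_output=False))  # [2, 5]
print(output_sizes(filter_output=True))   # [1, 1]
```

Without filtering, turn 2 surfaces five messages, including turn 1's "hi" twice; with filtering, each turn surfaces exactly one assistant message.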

Fix: _filter_messages

Returns only the last meaningful assistant message from list[Message]
output. This is intentionally more aggressive than just filtering
role="user" because:

  1. User messages — re-emitting these as assistant output is the
    reported bug
  2. Earlier assistant messages — these are conversation history replay
    from _full_conversation. They were already surfaced in real-time
    during workflow execution via AgentResponseUpdate streaming. Including
    them again in the termination output causes duplication.
  3. Default behavior alignment: GroupChatBuilder defaults to
    intermediate_outputs=False, meaning only the orchestrator's output
    is surfaced. Users who need intermediate agent responses can opt in
    via intermediate_outputs=True.

Relationship to #4275

PR #4275 filters the AgentResponse branch. This PR filters the
list[Message] branch, which is the actual code path hit by GroupChat's
termination yield (ctx.yield_output(self._full_conversation)). Both
fixes are complementary — they cover different branches of the same
conversion logic.

Tests added

  • Updated test_workflow_as_agent_yield_output_with_list_of_chat_messages
    to reflect the new filtering semantics (last assistant message only)

  • Added TestWorkflowAgentUserInputFilteringRegression with two
    multi-turn compounding regression tests:

    • test_streaming_compounding_not_observed_across_turns
    • test_nonstreaming_compounding_not_observed_across_turns

    These reproduce the exact escalating symptom from the issue report
    where turn 1's "hi" bled into turn 2's response.

@leelakarthik
Author

@copilot review

Contributor

@moonbox3 moonbox3 left a comment


Automated Code Review

Reviewers: 3 | Confidence: 83%

✗ Correctness

The _filter_messages method is logically sound for the stated purpose of preventing user-input compounding, but it has a return-type annotation bug (| None) even though None is never returned—callers use .extend() and for … in which would crash on None. Additionally, the regression tests yield AgentResponse (not list[Message]), so they likely never exercise the _filter_messages code path that this PR adds, giving false confidence in the fix.

✗ Security Reliability

The new _filter_messages method declares a return type of list[Message] | None, but neither call site guards against None. While the current implementation never actually returns None, the misleading type annotation invites a future change that would crash both callers (messages.extend(None) and for msg in None raise TypeError). Additionally, after filtering, raw_representations.append(data) still records the original unfiltered list, creating a mismatch between the response messages and their raw representations. Finally, when all messages have role='user', the filter silently returns an empty list, dropping the workflow output with no log or error.

✗ Test Coverage

The new _filter_messages method is exercised by the updated happy-path test, but its edge cases (empty list, all-user messages, assistant messages with None/whitespace-only text triggering the fallback path) have no coverage. More critically, the two regression tests in TestWorkflowAgentUserInputFilteringRegression yield AgentResponse objects via WorkflowContext[Never, AgentResponse], not list[Message]; since _filter_messages is only invoked on the is_instance_of(data, list[Message]) branch, these regression tests likely never exercise the new filtering logic and therefore don't guard against the regression they describe. The return type annotation list[Message] | None is also misleading because the method never returns None, and callers pass the result directly to list.extend() which would raise TypeError on None.

Blocking Issues

  • The return type of _filter_messages is declared as list[Message] | None, but the function never returns None. Callers at lines 487 and 604 call .extend(chat_messages) and for msg in chat_messages: respectively, which would crash on None. More importantly, a static type-checker will flag these call-sites as unsafe. The | None should be removed from the annotation.
  • The regression tests (TestWorkflowAgentUserInputFilteringRegression) use WorkflowContext[Never, AgentResponse] and yield an AgentResponse object—not a list[Message]. The _filter_messages call is only reached inside the is_instance_of(data, list[Message]) branch, so these tests never exercise the new filtering logic. They may pass regardless of whether _filter_messages exists, providing no actual regression coverage for the fix.
  • _filter_messages declares return type list[Message] | None but both callers use the result directly in extend() and for iteration without a None check. If the method is ever modified to return None (as the type hint permits), both call sites will raise TypeError at runtime.
  • The regression tests (test_streaming_compounding_not_observed_across_turns, test_nonstreaming_compounding_not_observed_across_turns) use WorkflowContext[Never, AgentResponse] and yield an AgentResponse object. _filter_messages is only called when is_instance_of(data, list[Message]) is true, so these tests almost certainly do not exercise the new filtering code path. They should yield list[Message] (matching the existing test pattern) to actually cover the fix.
  • The fallback path of _filter_messages (where no non-user message has non-empty text) is completely untested. When all messages are user messages, the method returns an empty list, silently dropping all output — this behaviour should be tested and verified as intentional.
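
The runtime hazard behind the annotation issue can be reproduced in isolation. The sketch below uses a hypothetical helper, `filtered_or_none`, whose annotation permits `None`; it is not the PR's method, only an illustration of what both call-site patterns do when handed `None`.

```python
from typing import Optional


def filtered_or_none(has_output: bool) -> Optional[list[str]]:
    # Hypothetical helper whose annotation permits None, like the one flagged above.
    return ["final answer"] if has_output else None


messages: list[str] = []
messages.extend(filtered_or_none(True))  # fine: extends with one item

errors = []
try:
    messages.extend(filtered_or_none(False))  # list.extend(None)
except TypeError as exc:
    errors.append(f"extend: {type(exc).__name__}")

try:
    for _ in filtered_or_none(False):  # iterating over None
        pass
except TypeError as exc:
    errors.append(f"for: {type(exc).__name__}")

print(messages)  # ['final answer']
print(errors)    # ['extend: TypeError', 'for: TypeError']
```

Annotating the return type as plain `list[Message]` lets a type checker reject any future `return None` at the source instead of at the callers.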

Suggestions

  • Consider what should happen when every message in the list has role == "user" (e.g., an echo workflow). The fallback returns [], silently dropping all output. It may be safer to return the original list unchanged in that edge case so no data is silently lost.
  • In _convert_workflow_events_to_agent_response (line 488), raw_representations.append(data) still appends the unfiltered original list while messages.extend(chat_messages) uses the filtered version. Document that this asymmetry is intentional, or filter raw_representations too, to avoid confusing downstream consumers.
  • Consider logging a warning inside _filter_messages when the returned list is empty (all messages were user-role), so silent data loss is diagnosable.
  • At line 488, raw_representations.append(data) still records the original unfiltered list while messages contains the filtered subset—verify this mismatch is intentional and won't confuse downstream consumers that correlate messages with raw_representations.
  • Add direct unit tests for _filter_messages covering: empty input list, single assistant message, all-user messages, assistant message with text=None, assistant message with whitespace-only text, and mixed roles where the last non-user message has no text (fallback path).
  • Fix the return type annotation from list[Message] | None to list[Message] since the method never returns None and callers pass the result to list.extend(), which would raise TypeError on None.
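
As a starting point for the suggested edge-case coverage, the cases could be tabulated roughly as follows. This is a sketch under assumptions: `Message` and `filter_messages` are stand-ins rather than the PR's code, the expected values encode one plausible reading of the fallback (skip blank assistant messages, return an empty list if nothing qualifies), and the logger name is made up.

```python
import logging
from dataclasses import dataclass
from typing import Optional

logger = logging.getLogger("workflow_agent_sketch")  # hypothetical logger name


@dataclass
class Message:
    # Stand-in for the framework's Message type.
    role: str
    text: Optional[str] = None


def filter_messages(messages: list[Message]) -> list[Message]:
    """Assumed semantics: last non-user message with non-blank text, else empty."""
    for msg in reversed(messages):
        if msg.role != "user" and msg.text and msg.text.strip():
            return [msg]
    if messages:
        # Make the otherwise silent drop diagnosable.
        logger.warning("Dropping %d message(s): no assistant text to surface", len(messages))
    return []


# name -> (input messages, expected surfaced texts)
CASES = {
    "empty input": ([], []),
    "single assistant": ([Message("assistant", "done")], ["done"]),
    "all user": ([Message("user", "a"), Message("user", "b")], []),
    "text is None": ([Message("assistant", None)], []),
    "whitespace only": ([Message("assistant", "   ")], []),
    "last assistant blank": (
        [Message("assistant", "kept"), Message("user", "q"), Message("assistant", "")],
        ["kept"],
    ),
}

results = {name: [m.text for m in filter_messages(msgs)] for name, (msgs, _) in CASES.items()}
for name, (_, expected) in CASES.items():
    assert results[name] == expected, name
```

If the all-user case should instead return the original list unchanged, only that row's expected value and the fallback branch need to change.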

Automated review by moonbox3's agents

leelakarthik and others added 6 commits February 26, 2026 19:25
…ng messages in response

Add a method to filter messages and update message handling.

This helps remove the user messages being duplicated/emitted in the response

This fixes microsoft#4261 for both the streaming and non-streaming functions
…ge] filtering expectations (microsoft#4261)

- Update test_workflow_as_agent_yield_output_with_list_of_chat_messages to
  reflect _filter_messages semantics: only the last meaningful assistant
  message is surfaced from list[Message] output, preventing user input
  re-emission and full conversation history replay.

- Add TestWorkflowAgentUserInputFilteringRegression with multi-turn
  compounding regression tests for both streaming and non-streaming paths,
  reproducing the exact escalating symptom from microsoft#4261.

Users who need intermediate agent responses can opt in via
intermediate_outputs=True in GroupChatBuilder.
Co-authored-by: Evan Mattson <35585003+moonbox3@users.noreply.github.com>
@leelakarthik
Copy link
Author

@moonbox3 All review feedback addressed — tests added, inline comment added for raw_representations asymmetry, and branch rebased on latest main. Ready for another look when you get a chance.

Development

Successfully merging this pull request may close these issues.

Python: [Bug]: _convert_workflow_event_to_agent_response_updates emits user input again

3 participants